Skip to content

Conversation

@mohammedahmed18
Copy link
Contributor

@mohammedahmed18 mohammedahmed18 commented Sep 25, 2025

PR Type

Bug fix, Enhancement


Description

  • Standardize LSP message delimiter handling

  • Refactor tag extraction and defaults

  • Align logging flow with LSP enablement

  • Minor cleanup in optimizer logs


Diagram Walkthrough

flowchart LR
  L1["lsp_message: define message_delimiter"] -- "import" --> L2["lsp_logger: use delimiter to detect JSON"]
  L2 -- "extract tags & normalize" --> L3["enhanced_log: normal vs LSP path"]
  L3 -- "serialize LSP text" --> L4["LspTextMessage.serialize: wrap with delimiter"]
Loading

File Walkthrough

Relevant files
Enhancement
lsp_logger.py
Refactor LSP-aware logging and tag parsing                             

codeflash/lsp/lsp_logger.py

  • Import and use message_delimiter.
  • Make extract_tags robust, return default tags.
  • Rework logging flow for LSP vs normal handling.
  • Apply heading/highlight tags with defaults.
+32/-28 
Bug fix
lsp_message.py
Centralize and apply LSP message delimiter                             

codeflash/lsp/lsp_message.py

  • Introduce shared message_delimiter.
  • Serialize messages wrapped by delimiter on both sides.
  • Simplify ordering comment and usage.
+4/-5     
Miscellaneous
function_optimizer.py
Trim redundant profiling log output                                           

codeflash/optimization/function_optimizer.py

  • Remove redundant line profiling info log.
+0/-1     



def extract_tags(msg: str) -> tuple[Optional[LspMessageTags], str]:
def extract_tags(msg: str) -> tuple[LspMessageTags, str]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was optimized by codeflash extension

@github-actions
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Behavior Change

The logic around handling unsupported log levels and '!lsp' changed: previously such messages were dropped in LSP mode; now they are logged normally unless 'force_lsp' is set. Verify this aligns with desired behavior and does not spam logs when LSP is enabled.

unsupported_level = level not in supported_lsp_log_levels

# ---- Normal logging path ----
if not tags.lsp:
    if not lsp_enabled:  # LSP disabled
        actual_log_fn(clean_msg, *args, **kwargs)
        return
    if tags.not_lsp:  # explicitly marked as not for LSP
        actual_log_fn(clean_msg, *args, **kwargs)
        return
    if unsupported_level and not tags.force_lsp:  # unsupported level
        actual_log_fn(clean_msg, *args, **kwargs)
        return
Tag Parsing

Tag extraction now only triggers when exactly one '|' exists; messages with multiple '|' or leading/trailing delimiters will skip tag parsing. Confirm this is intentional and won’t miss valid tags in edge cases.

def extract_tags(msg: str) -> tuple[LspMessageTags, str]:
    delimiter = "|"
    first_delim_idx = msg.find(delimiter)
    if first_delim_idx != -1 and msg.count(delimiter) == 1:
        tags_str = msg[:first_delim_idx]
        content = msg[first_delim_idx + 1 :]
        tags = {tag.strip() for tag in tags_str.split(",")}
        message_tags = LspMessageTags()
        # manually check and set to avoid repeated membership tests
        if "lsp" in tags:
            message_tags.lsp = True
        if "!lsp" in tags:
            message_tags.not_lsp = True
        if "force_lsp" in tags:
            message_tags.force_lsp = True
        if "loading" in tags:
            message_tags.loading = True
        if "highlight" in tags:
            message_tags.highlight = True
        if "h1" in tags:
            message_tags.h1 = True
        if "h2" in tags:
            message_tags.h2 = True
        if "h3" in tags:
            message_tags.h3 = True
        if "h4" in tags:
            message_tags.h4 = True
        return message_tags, content

    return LspMessageTags(), msg
Delimiter Detection

JSON detection switched to checking for a leading delimiter; ensure all producers of LSP messages are wrapped with the delimiter on both sides, and that no plain text could accidentally start with the delimiter character.

# \u241F is the message delimiter becuase it can be more than one message sent over the same message, so we need something to separate each message
message_delimiter = "\u241f"


@dataclass
class LspMessage:
    # to show a loading indicator if the operation is taking time like generating candidates or tests
    takes_time: bool = False

    def _loop_through(self, obj: Any) -> Any:  # noqa: ANN401
        if isinstance(obj, list):
            return [self._loop_through(i) for i in obj]
        if isinstance(obj, dict):
            return {k: self._loop_through(v) for k, v in obj.items()}
        if isinstance(obj, json_primitive_types) or obj is None:
            return obj
        if isinstance(obj, Path):
            return obj.as_posix()
        return str(obj)

    def type(self) -> str:
        raise NotImplementedError

    def serialize(self) -> str:
        data = self._loop_through(asdict(self))
        ordered = {"type": self.type(), **data}
        return message_delimiter + json.dumps(ordered) + message_delimiter

@github-actions
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Avoid LSP serialization when disabled

When LSP is disabled but a message is tagged lsp, the code falls through into the
LSP path and serializes to an LSP-wrapped format, effectively dropping user-visible
logs. Short-circuit to normal logging if lsp_enabled is False, unless force_lsp is
set. This aligns behavior with the prior logic and avoids silent log loss.

codeflash/lsp/lsp_logger.py [102-119]

 # ---- Normal logging path ----
 if not tags.lsp:
-    if not lsp_enabled:  # LSP disabled
+    if not lsp_enabled:
         actual_log_fn(clean_msg, *args, **kwargs)
         return
-    if tags.not_lsp:  # explicitly marked as not for LSP
+    if tags.not_lsp:
         actual_log_fn(clean_msg, *args, **kwargs)
         return
-    if unsupported_level and not tags.force_lsp:  # unsupported level
+    if unsupported_level and not tags.force_lsp:
         actual_log_fn(clean_msg, *args, **kwargs)
         return
+
+# If LSP is disabled, avoid serializing to LSP unless forced
+if not lsp_enabled and not tags.force_lsp:
+    actual_log_fn(clean_msg, *args, **kwargs)
+    return
 
 # ---- LSP logging path ----
 if is_normal_text_message:
     clean_msg = add_heading_tags(clean_msg, tags)
     clean_msg = add_highlight_tags(clean_msg, tags)
     clean_msg = LspTextMessage(text=clean_msg, takes_time=tags.loading).serialize()
Suggestion importance[1-10]: 8

__

Why: As written, tags.lsp messages may still get serialized when LSP is disabled, dropping user-visible logs; adding a short-circuit preserves logging behavior unless explicitly forced. This prevents functional regressions and aligns with prior logic.

Medium
Robust LSP message detection

Relying on startswith(message_delimiter) to detect LSP messages is brittle and
misclassifies plain text that happens to start with the delimiter. Instead, detect
messages that are actually wrapped by the delimiter on both ends. This prevents
false positives and ensures the normal text path is used correctly.

codeflash/lsp/lsp_logger.py [92-94]

-is_lsp_json_message = msg.startswith(message_delimiter)
-is_normal_text_message = not is_lsp_json_message
+is_lsp_message_wrapped = msg.startswith(message_delimiter) and msg.endswith(message_delimiter)
+is_normal_text_message = not is_lsp_message_wrapped
Suggestion importance[1-10]: 7

__

Why: Detecting only with startswith can misclassify plain text; checking both prefix and suffix better matches the new framing and reduces false positives. This is a reasonable improvement, though not critical.

Medium
Preserve delimiter compatibility

Prefixing and suffixing with the delimiter breaks backward compatibility if any
consumers still expect a single trailing delimiter. If the ecosystem expects
trailing-only delimiters, keep the trailing form and avoid leading delimiters to
prevent parsing failures. Alternatively, version the protocol before changing
framing.

codeflash/lsp/lsp_message.py [39]

 def serialize(self) -> str:
     data = self._loop_through(asdict(self))
     ordered = {"type": self.type(), **data}
-    return message_delimiter + json.dumps(ordered) + message_delimiter
+    return json.dumps(ordered) + message_delimiter
Suggestion importance[1-10]: 4

__

Why: The concern about backward compatibility is valid in general, but the PR intentionally changes framing to add both leading and trailing delimiters and logger logic aligns with it. Without evidence of consumers requiring trailing-only, this is low-impact and potentially counter to the PR intent.

Low

@Saga4 Saga4 enabled auto-merge (rebase) September 27, 2025 01:58
@Saga4 Saga4 merged commit e973c69 into main Sep 27, 2025
18 of 19 checks passed
@KRRT7 KRRT7 deleted the lsp/fix-delimiter-and-lsp-tag branch September 27, 2025 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants